The cDNAs encoding human prostatic acid phosphatase were cloned and 
characterized. The mRNAs contain 3' noncoding regions of heterogeneous sizes: 
646, 1887 or 1913 nucleotides. A dimer and a monomer of the conserved 
Alu-repeats are present in the longer 3' noncoding sequences. The complete 
sequence of 354 amino acids for the mature enzyme was determined by sequencing 
both cDNA and protein. Human prostatic and lysosomal acid phosphatases 
exhibit 50% sequence homology, including five Cys residues and two putative 
N-linked glycosylation sites. The Acp-3 gene coding for human prostatic acid 
phosphatase was mapped onto chromosome 3 in this investigation. The Acp-2 
gene coding for lysosomal acid phosphatase has previously been located on 
chromosome 11, while the Acp-1 gene coding for red blood cell acid phosphatase 
is on chromosome 2. 



Acid phosphatases (Acp; EC 3.1.3.2) are a group of enzymes capable 
of hydrolysing phosphomonoesters under acidic conditions, and can be 
differentiated according to their immunological properties, tissue 
distribution and subcellular location (1). Since Gutman et al. (1936) first 
reported an elevation of serum acid phosphatase activity in patients with 
metastatic prostate cancer (2), human prostatic acid phosphatase (PAP) has 
been used as a diagnostic marker for prostate cancer (3). The physiological 
function of PAP is not well understood, but recent reports suggested that PAP 
may function in vivo as a phosphotyrosyl-protein phosphatase (4). PAP is 
synthesized under androgen regulation by epithelial cells of the prostate, 
and it is secreted into the seminal fluid (5). In order to study the protein 
structure-function relationship and the mechanism of androgen regulation, we 
have undertaken the cDNA cloning and protein sequencing of human PAP. In this 
report, we describe the cDNA cloning and gene mapping of human PAP as well as 
the protein sequence homology between human prostatic and lysosomal acid 
phosphatases. 

Materials and Methods 

Protein sequencing: 

Human PAP protein was purified (6) and cleaved with CNBr and/or Lys-C 
protease (7). The resulting peptides were separated by HPLC on a microbore 
RP-C8 column using a linear gradient of acetonitrile (0-70% in 60 min) in 
0.1% trifluoroacetic acid (8). The amino-terminal sequences of the mature 
enzyme and purified peptides were determined with an automated micro-sequencer 
equipped with on-line HPLC and data module (Applied Biosystems, Inc.). 

cDNA cloning and sequencing: 

A human prostate λgt11 cDNA expression library established from benign 
prostatic hyperplasia (9) was screened with rabbit antiserum against human PAP 
as previously described (10-12). The putative positive clones were purified 
to homogeneity using polyclonal and monoclonal antibodies against human PAP. 
The DNAs purified from the positive clones were analyzed by restriction 
endonuclease mapping and Southern blotting (13). The DNA fragments derived 
from EcoRI cleavage, as well as overlapping deletions constructed using an 
Erase-a-Base kit (Promega), were further subcloned into M13 mp18 phage (14), 
and the nucleotide sequences of the inserted DNAs were determined by the 
dideoxy chain termination method with the sequencing protocol modified to use 
deoxyadenosine 5'-(α-[35S])thiotriphosphate (15). 

Gene mapping: 

Human chromosomes were isolated and flow-sorted into 24 spots on 
nitrocellulose discs as previously described (16). These spot-blots and a human 
genomic DNA blot were hybridized with 32P-labeled PAP cDNA probe as indicated 
in Figure 1. The conditions for hybridization and washing have been described 
previously (13,17). 



Results 

Protein sequencing: 

The amino-terminal sequence of 45 residues from the purified PAP protein 
was obtained by micro-sequencing. A total of 80 amino acids (residue nos. 
44-65, 148-153, 203-219, 240-251, 254-265 and 298-308) was also sequenced 
from four CNBr peptides and two Lys-C peptides (Fig. 2A). The amino acid 
sequence of one CNBr peptide was found to overlap with the amino-terminal 
sequence of the PAP protein. Thus, the sequence of the first 65 amino acids 
of the mature PAP enzyme was unambiguously established. 

Fig. 1. Restriction endonuclease map and sequencing strategy of human PAP 
cDNA clones. 

The restriction sites given (R, EcoRI; P, PstI and H, HindIII) are 
those used in nucleotide sequencing and preparation of hybridization probe. 
The direction and length of each sequencing run are indicated by arrows. The 
protein-coding sequence is denoted by an open box. The dimeric and monomeric 
Alu-repeats are shown by hatched boxes. 

cDNA cloning and sequencing: 

Six putative PAP cDNA clones were plaque-purified using polyclonal and 
monoclonal antibodies against PAP, and their inserted DNAs were shown to 
cross-hybridize with each other (data not presented). The EcoRI DNA fragments 
of these six clones were subcloned into M13 mp18 phages, and their nucleotide 
sequences determined as indicated in Figure 1. The combined nucleotide 
sequence of these PAP cDNAs contains an open reading frame. The first six 
amino acids, Phe-Leu-Asn-Glu-Ser-Tyr, which were deduced from the cDNA 
sequence, were identical with residue nos. 60 to 65 from the amino-terminal 
sequence of the purified PAP protein (Fig. 2A). The amino acid sequences of 
five peptides also confirmed the cDNA-deduced sequences of residue nos. 
148-153, 203-219, 240-251, 254-265 and 298-308. Therefore, the purified PAP 
enzyme consists of a sequence of 354 amino acids. Since these six PAP cDNAs 
did not possess the sequence coding for the first 59 amino acids, an 
EcoRI-PstI cDNA fragment of the coding region (as indicated in Fig. 1) was 
further used as a probe to isolate full-length PAP cDNA. However, none of the 
additional 24 PAP cDNA clones was found to contain the coding sequence 5' to 
codon no. 60. 

As to the 3' noncoding sequence of the PAP cDNAs (Fig. 2B), clone hP7 
contained a poly(A) tail of 51 residues added at nucleotide no. 646, and a 
putative polyadenylation signal of AATTAA was found at 20 nucleotides 5' to 
this poly(A) addition site. Clones hP26 and hP37 appeared to contain 
identical cDNA sequences ending at nucleotide no. 1126 without a poly(A) tail. 
Clone hP16 did not possess a poly(A) tail and terminated at nucleotide no. 
1442. Clone hP40 contained a poly(A) tail of 20 residues added at nucleotide 
no. 1887, while clone hP38 had a poly(A) tail of 41 residues added at 
nucleotide no. 1913. Both hP40 and hP38 clones contained a common 
ggccagaaacagctctectcaacatgaga 
-32 HetArg 

a gct9cacccctcctcctg9ccagg9Ca9ci<QCttagcctt99cttcttgtttctgcttttttt9Ctggcta9acc9«»gt9tacta9cc 
AlaAlaProLeuLeuLeuAlaArgAiaAUSerLeuAULeuAlaSerCysPneCysPhePheCysTrpLeuAspArgSerVaReuAU 

aag9agttgaa9tttgtgacttt99t9tttcg9Ut99«9acc9a.agtcccatt9acacctttcccactg4ccccat4aa99aatcctca 
1 LysGiuLeuL^sPheValThrLeuvalPhtArqHlsGlyAtpArgSerProrieAapThrPheProTfirAspPfoIleLyaflluSerSer 

tggcca taagga t ttggccaa c t ca c c cage tgggca tggag ca gca t ta tgaac t tgga gag ta ta ta agaaagaga ta tagaaAATTC 
31 TrpProGtftG)yPheGTyfitffLeuThrtTnLcuG)yWetG^uG)nHlsTyi^)uleuGlyG)uTvrI)eArgL>sArgTyrArgLysPhe 

TTGMTGACTCCTATAAACATGAACAG&TTTATATTCGAAGCACAGACGTTGACCGGACTTTGATGAGT 
61 LeuAsnGljSerTyr lyaHUGluGlnVaUyrneArgSerThrAspValAspArgThrLcuMetSerAlaWetThrAsnLeuAUAta 

CTGTTTCCCCCAGAA6GTGTCAGCATCT66VWTCCTA7CC7AC7C7GGCAGCCCA7CCCK 
91 LeuPheProProGluGlyValSerlleTrpAsnProI leLeuLeuTroGlnProI JeProValHtsThrValProLeuSerGluAspGln 

T7GCTATACC7GCCTnCAGGAACTGCCC7CGnT7CAACJM^ 
121 irutmuifrt *uProl>figArnAsnCysProAroPheG)nGluLeuGl uSerGl ulhrleuLysSerGluGluPheGl nlys ArgLetiHl 5 

CCTTATAAGGATTTTATAGCTACCTTGGGAAAACTTTCAI^TTACATGGCCAfiGACCTTTTTGGAATTTGG^ 
151 ProTyrLys AspPheneAlaThrLeuG]yLysLeuSerGlyLeuHisGlyGlfiAspLeuPt>cG)yneTrpSerLy>vanyrAspPro 

TTATAnCT6A6AGT6TTCACMTTTCACTT7ACCCTCCTG6GCCAC7GAGGA£ACCATGACTA 
181 L»uTyr<:y4GluSeryalH1sAsnPheThrL<uPro5erTrpA)aTrtrGluAspTftrWetThrtya LeuArgGtuLftuSerG1uLeuSgr 

CTCC7G7CCCTC7A7GGAAnCACA«CAGAAAGAGAMTCTAG6CK 
211 LeuLeuSerLeuTryGTyn»HlsLy» GloLy3GlutysSerArgLCMG)nGlyGlyVaiLeuya1AsnG1uneteuAsnH1&>«<tLj>s 

AGAGCA«TCAGA7A£CAAGCTACAAAAAA£TTATCATG7A7 
241 ArgAUThrGlnlUProSerTyrtyslysLeu neWct TyrSerAlaHlsAspThrThrValSerGlyLtuG ltiWctAlaLtuAspVal 

TACAACGGACTCCTTCCTCCCTATGCTTCTTGCCAC77GACGGMTTGTACTTTGA6AAGGGGGAGTACTTTGTGGAGATGTAC7ACCGG 
271 Tyr^nfiiyLtuLeuProProTyrAtaSerCysHtsLeurhrGtuLeuTyrPheGluLysGtyGluTyrPhevalGluMet TryTryArg 

AATGAGACGCAGCACGAGCCGTATCCCCTCATGCTACCTGGCTGCAGCCCCAGCTGTCCTCTGGAGAGGTTTGCTGAGCTGGTTGGCCCT 
301 AsnGtuThrOnH^sGluPro TyrProieuWetieuProGlyCysSerProSerCysProleuGluArgPheAlaGluLeuValGlyPrc 

GTGATCCCTCAAGACTGGTCCACGGAGGTTATGACCACAAACAGCCATCAAGG7ACTGAGGACAGTACAGATTAG 
331 vaHleProGlrAspTrpSerThrtluValKetThrThrAsnSerHUGlnGlyThrGluAspSerThrAsp 354 

10 20 30 40 50 60 70 B0 90 100 

B 1 TGTGCACAGAGATCTCTGTAGAAAGAGTAGCTGCCCTTTCTCAGGGCAGATGATGCTTTGAGAACATACTTTGGCCATTACCCCCCAGCTTTGAGGAAM 

101 TGGGC7TTGGATGAT7ATTTTATGnTTAG4GGACCCCCAACCTCAGGCAATTCCATCC7CTTCACCCGACCCTGCCCCCACTTGCCATAAAACTTAGCT 

201 AAG7TT7GTT7TGTTTTTCAGCGT7AATG7AAAGGGGCAGCAGTGCCAAAATATAACAGAGA7AAAGCT7AGGTCAAAG7TCATAGAGT7CCCATGAAC7 

301 A7ATGACTGGCCACACAGGATC7TTT67ATTTAAG6A7TCT6AGATTTTGCTTGAGCA6GATTAGArAAGGCTGTTCTr7AMTGTC76AAA7G6AACAG 

401 ATTTCAAAAAAAACCCCACAAT CTAGGGTGGGAACAAGGAAGGAAAGAT GTGAAT AGGC7GA7GGGCAAAAAACCAAT T T ACCCATCAGT7CCAGCCTT C 

501 TC7CAAG6AGAGGCAAAGAMGGAGATACMTGGAGKATCTG6AAAGTTTTC7CCACTGGAAAAC7GC7AC7A7 

601 7A7GAGGC7ACAGAAC7AAj AAA77AAA A£C7C777G7G7CCCT7CG7CCTGGAACA7T7A7G77CC777TAAAGAAACAAAAA7CAAAC7T7AC 

701 AT7TGATGTATGTAA7ACATATWXA6CTC7TGAAGTATATATATCA7AGCAAA7AAGTCATCTGATGAGAACAAGCT 

fiOl *AGAGAGCACCACGTGATGGAGT7TCTCCAGAAGCTCCAG76ATAAGAGA7G7TGACTCT AAAGTTGATTTAAG^ 

901 ATCCCACCAT7TTGCGAGTCCGAGGTGGGCAGATCAC7TGAGCTCAGGAGGTCAAGATCAGCCTGGGCAACATGGTGAAACCTTGTC7CTA 

1001 CAAAMCTTAGATGGGCATGGTGGT6TGTGCCTATAG7CCACTACTTGTGGGGCTAAGGCAGGAGGATCACTTGAGCCCCGGA6GTCGAGGCTACAGTGA 

1201 TAAAAACAAAG7TGAT TAACA AAGGAAGTATAGGCTAG<KACAGTGGCTCACACCTGTAATCCTTGCATTTTGGAAGGCTGAGGCAG 

1301 6GCCT6C76TGTTCAAfiACCA6CCT6G7CAACATA67GA6ACACTGTC7C7ACCAAAAA^ 

1401 AA7GTAA7TA7GT7ATGTTCTAAG7GCC7CCAAGTTCAAAAC7TATTGjGAATGT7GAGAGTGTGG77ACGAAATACGTT 

1501 AAGTCTTTAATGCCC6ATATCT7CAGAAAACCTAAGCAAACTTACAGGTCCTGCTGAAACTGCCCACTCTGCAA6AAGAAATCATGATAT 

1601 TG7GGCAGATCTACA7GTCTAGAGAACAC76TGC7C7ATTACCATTATGGATAAAGATGAGATGG777C7AGAGATGG7TTCTACTGGCTGCCAGAATCT 

1701 AGAGCAAA6CCATCCCCGCTCC7GGT1G€TCACAGAATGACTGACAAAGACATCG>T7GATA7GCTTCTTTGTGTTATTTCCCTCCCAAG7A>ATG 

1801 7CC77GGG7CCA7777CTA7GC77G7AAC7G7C77C7AGCAG7GAGCCAM7G7AAAATAG76 AATAAA GTCAT7AT7AGGAAG77PAAAAGCAT7GC77 

1901 TTATAATGAACTJAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA 

Fig. 2. The nucleotide and amino acid sequences of human PAP. 

A. The amino acid sequence of PAP is numbered from the first residue 
of the mature enzyme, and the number of the first amino acid in each row is 
given on the left. The amino acid sequences obtained by direct protein 
sequencing are underlined. The nucleotide sequence 5' to codon no. 60 is 
from the previously reported sequence (20) in which a sequence of Trp-Ile-Trp- 
Pro-Thr-His-Pro-Ala was deduced for amino acid nos. 34 to 41. Further, Ala 
and Cys were reported for amino acid nos. 180 and 340, respectively (20). It 
may also be noted that a Gln instead of Glu at amino acid no. 2 was detected 
previously (10). 

B. The 3' noncoding sequence of human PAP is numbered from the first 
nucleotide after the termination codon TAG, and the number of the first 
nucleotide in each row is given on the left. Clones hP7, hP40 and hP38 have 
poly(A) addition sites (indicated by dots) at nucleotide nos. 646, 1887 and 
1913, respectively. The dimer and monomer of Alu-repeats are underlined. The 
previously reported sequence (20) and the sequence determined in this 
investigation exhibit five substitutions at nucleotide nos. 156 (T for A), 
157 (A for T), 168 (T for C), 385 (T for G) and 1211 (C for G); a single 
nucleotide (T) insertion after residue no. 256 and a double nucleotide (TT) 
addition after residue no. 413; and eight deletions at nucleotide nos. 417, 
1206, 1210, 1219, 1223, 1227, 1243 and 1842. 
Fig. 3. Analyses of human genomic blot and chromosome spot-blots using human 
PAP cDNA probe. 

A. The total human genomic DNA was cleaved with restriction 
endonucleases EcoRI (R), PstI (P) or HindIII (H), and electrophoresed on a 
0.75% agarose gel. The DNA fragments were transferred to a nitrocellulose 
filter and hybridized with the 32P-labeled coding probe of PAP cDNA as 
indicated in Fig. 1. The estimated sizes of the major DNA fragments are: 
4.6 kb for EcoRI and PstI, and 3.8 kb for HindIII. 

B. The human chromosomes of each type were flow-sorted directly onto 
nitrocellulose discs as previously described (16). The spot-blots were 
hybridized with the same coding probe used in the genomic blot analysis. 



polyadenylation signal of AATAAA located at 18 and 44 nucleotides, 
respectively, upstream of their poly(A) addition sites. Thus, clones hP26, 
hP37, hP16, hP40 and hP38 appeared to have originated from mRNAs in which the 
3' noncoding sequences contain a dimer and a monomer of Alu-repetitive 
sequences. The dimeric Alu-sequence, including 12 repeats of TAAA, is flanked 
by direct repeats of AAAGTTGATT (nucleotide nos. 861-1216), while the 
monomeric Alu-sequence is surrounded by direct repeats of AAGGAAG (nucleotide 
nos. 1222-1370). 
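
The relationship described above, a hexamer signal lying a short distance 5' 
to a poly(A) addition site, can be checked with a simple scan. The following 
is a minimal Python sketch, not the procedure used in this investigation. 
The function name, the 50-nucleotide window, the convention that distance is 
counted from the first base of the signal, and the toy sequence are all 
assumptions made for illustration. 

```python
def find_polya_signals(seq, polya_site, window=50,
                       signals=("AATAAA", "AATTAA")):
    """Return (signal, distance) pairs found upstream of a poly(A) site.

    polya_site is the 0-based index of the first tail residue, i.e. the
    number of nucleotides preceding the tail. distance is counted from the
    first base of the signal to the poly(A) addition site (an assumed
    convention; the source does not state how it counted).
    """
    seq = seq.upper()
    start = max(0, polya_site - window)
    upstream = seq[start:polya_site]
    hits = []
    for signal in signals:
        pos = upstream.find(signal)
        while pos != -1:
            hits.append((signal, polya_site - (start + pos)))
            pos = upstream.find(signal, pos + 1)
    # Nearest signal first.
    return sorted(hits, key=lambda hit: hit[1])


# Toy sequence (hypothetical, not taken from Fig. 2B): a signal followed by
# 12 spacer nucleotides and a 20-residue poly(A) tail.
toy = "GGCC" + "AATAAA" + "C" * 12 + "A" * 20
toy_hits = find_polya_signals(toy, polya_site=22)
print(toy_hits)
```

Applied to a clean copy of the 3' noncoding sequence, the same call with 
polya_site set to 646, 1887 or 1913 would list candidate signals for clones 
hP7, hP40 and hP38. 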

Gene mapping: 

The genomic blot analysis of human total DNA using the PAP coding probe 
showed a single hybridization band in PstI-cut DNA (Fig. 3A). The detection 
of a strong band and a weak band in the DNAs of either EcoRI or HindIII 
cleavage suggested the presence of an intron within the cDNA region used as a 
probe, since this cDNA sequence contains neither an internal EcoRI nor a 
HindIII site. The results of chromosome spot-blot hybridization with the same 
PAP coding probe indicated that only chromosome 3 exhibits a positive signal 
(Fig. 3B). Thus, the gene coding for human PAP is located on chromosome 3. 



Discussion 

The complete amino acid sequence of purified human PAP was determined by 
sequencing both protein and cDNA, and the mature enzyme contains a sequence of 
354 amino acids, including 10 Ala, 15 Arg, 10 Asn, 16 Asp, 5 Cys, 17 Gln, 29 
Glu, 19 Gly, 13 His, 14 Ile, 44 Leu, 19 Lys, 10 Met, 15 Phe, 25 Pro, 27 Ser, 
24 Thr, 6 Trp, 19 Tyr and 17 Val. Three putative N-linked glycosylation sites 
of PAP were found at Asn-Glu-Ser (residues 62-64), Asn-Phe-Thr (residues 
188-190) and Asn-Glu-Thr (residues 301-303). The glycosylation at these three 
potential sites of PAP may explain the difference between the estimated 50 kDa 
of glycosylated subunits (6) and the calculated Mr of 40,939 for mature 
polypeptide chains. 
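
The composition above can be cross-checked arithmetically: the counts should 
sum to 354 residues, and a polypeptide mass can be estimated from them. The 
following is a minimal Python sketch. The residue counts are those listed in 
the text; the average residue masses are standard textbook values supplied 
here as an assumption, and the mass table used for the reported Mr of 40,939 
is not stated, so only approximate agreement should be expected. 

```python
# Residue counts of mature PAP as listed in the text.
COMPOSITION = {
    "Ala": 10, "Arg": 15, "Asn": 10, "Asp": 16, "Cys": 5,
    "Gln": 17, "Glu": 29, "Gly": 19, "His": 13, "Ile": 14,
    "Leu": 44, "Lys": 19, "Met": 10, "Phe": 15, "Pro": 25,
    "Ser": 27, "Thr": 24, "Trp": 6, "Tyr": 19, "Val": 17,
}

# Average residue masses in Da (assumed textbook values, not from the source).
RESIDUE_MASS = {
    "Ala": 71.0788, "Arg": 156.1875, "Asn": 114.1038, "Asp": 115.0886,
    "Cys": 103.1388, "Gln": 128.1307, "Glu": 129.1155, "Gly": 57.0519,
    "His": 137.1411, "Ile": 113.1594, "Leu": 113.1594, "Lys": 128.1741,
    "Met": 131.1926, "Phe": 147.1766, "Pro": 97.1167, "Ser": 87.0782,
    "Thr": 101.1051, "Trp": 186.2132, "Tyr": 163.1760, "Val": 99.1326,
}
WATER = 18.015  # one water per chain for the free N- and C-termini


def total_residues(composition):
    """Sum the residue counts."""
    return sum(composition.values())


def average_mass(composition):
    """Estimate the unmodified polypeptide mass in Da."""
    residues = sum(n * RESIDUE_MASS[aa] for aa, n in composition.items())
    return residues + WATER


print(total_residues(COMPOSITION))
print(round(average_mass(COMPOSITION)))
```

The estimate ignores glycosylation and disulfide bonds, which is the point of 
the comparison with the 50 kDa glycosylated subunit. 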

Human PAP mRNAs have 3' noncoding regions of heterogeneous sizes. Clone 
hP7 contains only 646 nucleotides of the 3' noncoding sequence, while clones 
hP40 and hP38 have 3' noncoding regions of 1887 and 1913 nucleotides, 
respectively. A dimer and a monomer of Alu-repeats are present in the longer 
3' noncoding sequences, and these two Alu-repeats exhibit approximately 80% 
homology with the consensus Alu-sequence (18). It is of interest that an 
Alu-repeat is also present in the 3' noncoding region of the cDNA encoding 
human placental alkaline phosphatase (19). However, the biological 
significance of the Alu-repeats in the 3' noncoding region of PAP mRNAs 
remains to be determined. 

This investigation found that none of the 30 isolated PAP cDNA clones 
contains the coding sequence 5' to codon no. 60, while several PAP cDNAs 
were truncated at the same AAATTC site of codon nos. 59-60. This result may 
indicate that EcoRI restriction endonuclease cleaved this EcoRI-like sequence, 
while EcoRI methylase did not protect it during the cDNA cloning 
(9). Thus, a different procedure that does not use EcoRI restriction 
endonuclease must be utilized to establish the cDNA library in order to obtain 
full-length PAP cDNA. 
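
Sites such as AAATTC, which differ from the EcoRI recognition sequence GAATTC 
at a single position, can be located before a library is constructed. The 
following is a minimal Python sketch of such a scan. The function name and 
the toy sequence are hypothetical, and a one-mismatch match only flags a 
candidate site; it does not predict whether cleavage actually occurs. 

```python
ECORI_SITE = "GAATTC"


def near_sites(seq, site=ECORI_SITE, max_mismatches=1):
    """List (position, matched sequence, mismatches) for every window of
    seq that differs from site at no more than max_mismatches positions.
    Positions are 0-based."""
    seq = seq.upper()
    hits = []
    for i in range(len(seq) - len(site) + 1):
        window = seq[i:i + len(site)]
        mismatches = sum(1 for a, b in zip(window, site) if a != b)
        if mismatches <= max_mismatches:
            hits.append((i, window, mismatches))
    return hits


# Toy sequence (hypothetical): one EcoRI-like site and one canonical site.
toy_hits = near_sites("CCAAATTCGGGAATTC")
print(toy_hits)
```

Windows reported with one mismatch are the ones worth checking when choosing 
a cloning strategy that relies on EcoRI digestion. 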

Fig. 4. Comparison of amino acid sequences of human PAP and LAP. 

The amino acids of LAP (21) that differ from PAP are given, while 
identical residues are indicated by hyphens. The open triangles mark 
additions/deletions in the PAP and LAP sequences. The conserved Cys residues 
are shown by dots, while the potential N-linked glycosylation sites are 
labeled by stars. The transmembrane peptide of LAP is denoted by m. 

While this work was in progress, human PAP cDNA was independently cloned 
(20). A comparison of the protein sequences revealed 10 amino acid differences 
at residue nos. 34 to 41, 180 and 340. The differences between the sequence of 
Gly-Phe-Gly-Gln-Leu-Thr-Gln-Leu obtained by direct protein sequencing in this 
investigation and the predicted sequence of Trp-Ile-Trp-Pro-Thr-His-Pro-Ala 
can be explained simply by an insertion of T at codon no. 34 and a deletion of 
G at codon no. 41 in the reported cDNA sequence (20). It should also be noted 
that the amino acid sequence of residue nos. 34 to 39 obtained by direct 
protein sequencing is identical to the corresponding sequence of human LAP 
(21) as indicated in Figure 4. The differences at amino acid nos. 180 (Pro vs 
Ala) and 340 (Val vs Cys) may represent genetic polymorphisms in the human 
population. In the 3' noncoding region of 1913 nucleotides (Fig. 2B) there are 
5 nucleotide substitutions and 11 additions/deletions between the reported 
sequence (20) and the sequence determined in this investigation. 
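
The frameshift explanation for residue nos. 34 to 41 can be illustrated with 
a short translation exercise: a one-base insertion shifts the reading frame, 
and a one-base deletion further downstream restores it, so that only the 
residues in between change. The following is a minimal Python sketch. The 
codons are chosen to be consistent with the two peptide sequences quoted in 
the text, and the flanking codons are illustrative; none of it should be read 
as a verified transcription of Fig. 2A. 

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO)
}


def translate(dna):
    """Translate complete codons from the first base; ignore any remainder."""
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


# Assumed coding segment: Gln, then Gly-Phe-Gly-Gln-Leu-Thr-Gln-Leu,
# then Gly-Met.
reference = "CAA" "GGATTTGGCCAACTCACCCAGCTG" "GGCATG"

# Insert T at the start of the eight-codon stretch (frame shifts by +1) ...
shifted = reference[:3] + "T" + reference[3:]
# ... then delete one G at the end of the stretch (frame restored).
variant = shifted[:27] + shifted[28:]

print(translate(reference))
print(translate(variant))
```

The two translations agree outside the eight-residue window and differ at 
every position inside it, which is the pattern expected when an insertion and 
a compensating deletion bracket a short stretch of sequence. 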

The gene coding for PAP was mapped onto human chromosome 3, and this 
gene is designated as Acp-3. The Acp-1 gene coding for red blood cell acid 
phosphatase is located on chromosome 2; the Acp-2 gene coding for lysosomal 
acid phosphatase (LAP) is on the short arm of chromosome 11 (21-23). Recently, 
human LAP was reported to contain 393 amino acids, including a transmembrane 
sequence (21). There is 50% (177/351) homology between the corresponding 
amino acid sequences of human PAP and LAP (Fig. 4). All five Cys residues 
as well as the first and third potential N-linked glycosylation sites of the 
PAP sequence are conserved in the LAP sequence. Human PAP and LAP precursor 
proteins contain hydrophobic signal peptides of 32 and 30 amino acids, 
respectively (20,21). Both PAP and LAP enzymes are sensitive to inhibition by 
L-tartrate (6,21). These structural and functional similarities suggest that 
the genes coding for human prostatic and lysosomal acid phosphatases belong 
to a multi-gene family that originated from an ancestral gene during the 
course of evolution. 
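
The 50% figure quoted above is a ratio of identical positions to compared 
positions (177/351). The following is a minimal Python sketch of that 
calculation for a pair of pre-aligned sequences. How gapped positions were 
counted in this investigation is not stated; skipping any column that 
contains a gap is an assumption, and the short aligned strings are 
hypothetical. 

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Return (identical, compared, percent) for two aligned sequences.

    Columns with a gap in either sequence are skipped (assumed convention).
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    identical = 0
    compared = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == gap or b == gap:
            continue
        compared += 1
        if a == b:
            identical += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return identical, compared, 100.0 * identical / compared


# Hypothetical aligned fragments, one-letter code, "-" marks a gap.
toy_result = percent_identity("KELKF-VTL", "RSLRFAVTL")
print(toy_result)

# The ratio reported in the text.
reported = round(100.0 * 177 / 351, 1)
print(reported)
```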